feat(image): cross-platform clipboard paste, SVG support, and multi-model image handling#197

Merged
im10furry merged 1 commit into main from feat/cross-platform-image-support on Jun 9, 2026

@im10furry (Collaborator)

Summary

Cross-platform image handling: clipboard paste on all platforms, SVG support, and multi-model image passthrough.

New Modules

src/utils/image/media.ts

  • detectImageMediaType() - magic byte detection for PNG/JPEG/GIF/WebP
  • normalizeSupportedImageMediaType() - MIME normalization (image/jpg to image/jpeg)
  • isSvgBuffer() / isSvgExtension() - SVG detection
  • rasterizeSvgToPng() - SVG to PNG via sharp
  • imageBase64ToDataUrl() / imageBufferToDataUrl() - base64 to data: URL
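
As a rough illustration of the helpers listed above, here is a minimal sketch of magic-byte detection, MIME normalization, and data URL construction. The function names come from this PR, but the signatures, the `SupportedImageMediaType` type, and the `hasSignature` helper are assumptions made for the example; the actual code in src/utils/image/media.ts may differ.

```typescript
// Hypothetical sketch: names mirror the PR's bullet list, bodies are assumptions.
type SupportedImageMediaType = "image/png" | "image/jpeg" | "image/gif" | "image/webp";

const SUPPORTED_TYPES: readonly SupportedImageMediaType[] = [
  "image/png",
  "image/jpeg",
  "image/gif",
  "image/webp",
];

// Returns true when `bytes` contains `signature` starting at `offset`.
function hasSignature(bytes: Uint8Array, signature: number[], offset = 0): boolean {
  if (bytes.length < offset + signature.length) return false;
  return signature.every((b, i) => bytes[offset + i] === b);
}

// Inspects the leading bytes instead of trusting a file extension.
function detectImageMediaType(bytes: Uint8Array): SupportedImageMediaType | null {
  // PNG: 89 50 4E 47 0D 0A 1A 0A
  if (hasSignature(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "image/png";
  // JPEG: FF D8 FF
  if (hasSignature(bytes, [0xff, 0xd8, 0xff])) return "image/jpeg";
  // GIF: ASCII "GIF8" (covers GIF87a and GIF89a)
  if (hasSignature(bytes, [0x47, 0x49, 0x46, 0x38])) return "image/gif";
  // WebP: ASCII "RIFF" at offset 0 and "WEBP" at offset 8
  if (
    hasSignature(bytes, [0x52, 0x49, 0x46, 0x46]) &&
    hasSignature(bytes, [0x57, 0x45, 0x42, 0x50], 8)
  ) {
    return "image/webp";
  }
  return null;
}

// Maps aliases such as image/jpg onto the canonical type; unknown types yield null.
function normalizeSupportedImageMediaType(mime: string): SupportedImageMediaType | null {
  const lower = mime.trim().toLowerCase();
  const canonical = lower === "image/jpg" ? "image/jpeg" : lower;
  const match = SUPPORTED_TYPES.find((t) => t === canonical);
  return match === undefined ? null : match;
}

// Wraps an already base64-encoded payload in a data: URL.
function imageBase64ToDataUrl(base64: string, mediaType: SupportedImageMediaType): string {
  return `data:${mediaType};base64,${base64}`;
}
```

Checking content bytes first means a JPEG saved with a .png extension is still reported as image/jpeg, which is the behavior the FileReadTool change in this PR relies on.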

src/utils/model/visionContent.ts

  • extractTextAndImageUrls() - split text and images from mixed content arrays
  • getImageUrlFromPart() - extract data: URLs from Anthropic, image_url, and input_image blocks
  • toOpenAIImageUrlParts() / toResponsesImageParts() - format converters
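
The following sketch shows one way these helpers could fit together. The three image block shapes follow the public Anthropic, Chat Completions, and Responses API formats; the `ContentPart` union, the return shapes, and the choice to join text with newlines are assumptions for illustration, not the PR's actual implementation.

```typescript
// Hypothetical sketch: part shapes follow the public API formats, logic is assumed.
type TextPart = { type: "text"; text: string };
type AnthropicImagePart = {
  type: "image";
  source: { type: "base64"; media_type: string; data: string };
};
type ChatImagePart = { type: "image_url"; image_url: { url: string } };
type ResponsesImagePart = { type: "input_image"; image_url: string };
type ContentPart = TextPart | AnthropicImagePart | ChatImagePart | ResponsesImagePart;

// Returns a URL (usually a data: URL) for any supported image block, else null.
function getImageUrlFromPart(part: ContentPart): string | null {
  switch (part.type) {
    case "image":
      return `data:${part.source.media_type};base64,${part.source.data}`;
    case "image_url":
      return part.image_url.url;
    case "input_image":
      return part.image_url;
    default:
      return null;
  }
}

// Splits a mixed content array into joined text and a list of image URLs.
function extractTextAndImageUrls(parts: ContentPart[]): { text: string; imageUrls: string[] } {
  const texts: string[] = [];
  const imageUrls: string[] = [];
  for (const part of parts) {
    if (part.type === "text") {
      texts.push(part.text);
      continue;
    }
    const url = getImageUrlFromPart(part);
    if (url !== null) imageUrls.push(url);
  }
  return { text: texts.join("\n"), imageUrls };
}

// Chat Completions image parts.
function toOpenAIImageUrlParts(urls: string[]): ChatImagePart[] {
  return urls.map((url): ChatImagePart => ({ type: "image_url", image_url: { url } }));
}

// Responses API image parts.
function toResponsesImageParts(urls: string[]): ResponsesImagePart[] {
  return urls.map((url): ResponsesImagePart => ({ type: "input_image", image_url: url }));
}
```

Normalizing every block to a plain URL string first keeps the two format converters trivial and independent of where the image originally came from.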

Enhancements

| Module | Change |
| --- | --- |
| FileReadTool | SVG support, detect media type from magic bytes not extension |
| imagePaste.ts | Windows (PowerShell) and Linux (wl-paste/xclip) support |
| ChatCompletions | Tool result images become adjacent user vision messages |
| ResponsesAPI | User images become input_image, tool result images become function_call_output |
| openaiMessageConversion | Tool result images become adjacent user vision messages |
| UI | ClipboardImage struct preserves actual media type |
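
To make the "adjacent user vision message" behavior concrete, here is a minimal sketch of converting a tool result that contains an image into Chat Completions messages. It assumes the tool message itself carries only text, so the image is moved into a user message placed directly after it. The `toolResultToChatMessages` name, the simplified types, and the placeholder strings are all hypothetical and not taken from the PR.

```typescript
// Hypothetical sketch: simplified message types, not the adapter's real ones.
type ToolResultPart =
  | { type: "text"; text: string }
  | { type: "image"; source: { type: "base64"; media_type: string; data: string } };

type ToolResult = { tool_use_id: string; content: ToolResultPart[] };

type ChatTextPart = { type: "text"; text: string };
type ChatImagePart = { type: "image_url"; image_url: { url: string } };

type ChatMessage =
  | { role: "tool"; tool_call_id: string; content: string }
  | { role: "user"; content: Array<ChatTextPart | ChatImagePart> };

function toolResultToChatMessages(result: ToolResult): ChatMessage[] {
  const texts: string[] = [];
  const images: ChatImagePart[] = [];

  for (const part of result.content) {
    if (part.type === "text") {
      texts.push(part.text);
    } else {
      // Preserve the original MIME type in the data URL.
      const url = `data:${part.source.media_type};base64,${part.source.data}`;
      images.push({ type: "image_url", image_url: { url } });
    }
  }

  // The tool message keeps the text; the placeholder wording is an assumption.
  const messages: ChatMessage[] = [
    {
      role: "tool",
      tool_call_id: result.tool_use_id,
      content: texts.join("\n") || "(image output attached in the next message)",
    },
  ];

  // Images travel in a user message placed immediately after the tool message.
  if (images.length > 0) {
    messages.push({
      role: "user",
      content: [
        { type: "text", text: `Image output of tool call ${result.tool_use_id}:` },
        ...images,
      ],
    });
  }
  return messages;
}
```

Keeping the user message adjacent to the tool message preserves the association between the image and the call that produced it, while the tool message still satisfies the requirement that every tool call receives a reply.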

Tests (+13 cases)

| Test file | Cases |
| --- | --- |
| image-media.test.ts | Magic byte detection, MIME normalization, data URL conversion |
| process-user-input-images.test.ts | Pasted image preserves JPEG media type |
| file-read-tool-parity.test.ts | SVG rasterization, JPEG/GIF/WebP content detection |
| chat-completions-e2e.test.ts | Tool result images to user vision message |
| responses-api-e2e.test.ts | User/tool images to input_image/function_call_output |
| openai-message-conversion.test.ts | Tool result images to user vision message |

Gaps Closed

  • Clipboard paste now works on Windows and Linux
  • SVG files readable as images (rasterized to PNG)
  • OpenAI Chat Completions and Responses API pass images through to model
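
The PR names the tools used for clipboard paste (PowerShell on Windows, wl-paste/xclip on Linux) but not the exact invocations. The sketch below shows one plausible way to choose candidate commands per platform. The `clipboardImageCommands` function, the `ClipboardCommand` type, the PowerShell script, and the PNG-only request are all assumptions for illustration; imagePaste.ts may work differently.

```typescript
// Hypothetical sketch: command choices are assumptions, not the PR's code.
type ClipboardCommand = {
  command: string;
  args: string[];
  // How the caller should interpret stdout.
  stdout: "binary" | "base64";
};

// PowerShell script that prints the clipboard image as base64-encoded PNG.
// Assumes Windows PowerShell with System.Windows.Forms available.
const POWERSHELL_SCRIPT = [
  "Add-Type -AssemblyName System.Windows.Forms",
  "Add-Type -AssemblyName System.Drawing",
  "$img = [System.Windows.Forms.Clipboard]::GetImage()",
  "if ($img) { $ms = New-Object System.IO.MemoryStream; " +
    "$img.Save($ms, [System.Drawing.Imaging.ImageFormat]::Png); " +
    "[Convert]::ToBase64String($ms.ToArray()) }",
].join("; ");

// Returns commands to try in order; `platform` uses Node's process.platform values.
function clipboardImageCommands(platform: string): ClipboardCommand[] {
  switch (platform) {
    case "win32":
      return [
        {
          command: "powershell",
          args: ["-NoProfile", "-Command", POWERSHELL_SCRIPT],
          stdout: "base64",
        },
      ];
    case "linux":
      return [
        // Wayland sessions: wl-paste from wl-clipboard.
        { command: "wl-paste", args: ["--type", "image/png"], stdout: "binary" },
        // X11 sessions: xclip as a fallback.
        {
          command: "xclip",
          args: ["-selection", "clipboard", "-t", "image/png", "-o"],
          stdout: "binary",
        },
      ];
    default:
      // macOS was already supported before this PR and is not sketched here.
      return [];
  }
}
```

A caller would run each command in turn until one exits successfully with non-empty output, then pass the resulting bytes through magic-byte detection so that the ClipboardImage struct carries the real media type.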

…odel image handling

- Add src/utils/image/media.ts: magic-byte detection, MIME normalization, SVG rasterization
- Add src/utils/model/visionContent.ts: extract text+images from mixed content, convert formats
- FileReadTool: SVG rasterized to PNG, detect media type from magic bytes instead of extension
- imagePaste.ts: now supports Windows (PowerShell) and Linux (wl-paste/xclip) in addition to macOS
- ClipboardImage struct replaces raw base64 string, preserving actual media type
- ChatCompletions adapter: tool result images become adjacent user vision messages
- ResponsesAPI adapter: user images become input_image, tool result images become function_call_output
- openaiMessageConversion: tool result images become adjacent user vision messages with MIME preserved
- UI: onImagePaste now receives ClipboardImage struct for accurate media type
- Tests: 13 new cases covering media detection, SVG, chat-completions, responses-api, message conversion
@im10furry merged commit 60d98dd into main on Jun 9, 2026
3 checks passed
@im10furry deleted the feat/cross-platform-image-support branch on June 9, 2026 08:02